Geographic Information Retrieval and Visualization of Online Unstructured Documents
Abstract
Newspapers, travel narratives, blogs, books, and the Internet hold a huge amount of geographic information that can be extracted to support visual exploration. Understanding place references, however, requires knowledge of the document context, so tools for disambiguation are needed. For the automatic annotation of time and location, both shared world knowledge and document context need to be captured. This paper focuses on analyzing two kinds of online unstructured documents: travel narratives and online newspapers. Our approach explores tools that can disambiguate place names automatically. Specifically, we use a Geoparsing Web Service to extract geographic coordinates from the online unstructured documents. Once the geographic coordinates have been extracted using eXtensible Markup Language (XML), we draw the geo-positions and link the documents onto a map image in order to visualize the textual information.
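The step from geoparser output to map markers can be illustrated with a minimal sketch. The abstract does not specify the Geoparsing Web Service or its XML schema, so the element and attribute names below (`geoparse`, `document`, `place`, `lat`, `lon`), the sample response, and the function names are all hypothetical; the GeoJSON output is likewise only one possible way to hand the linked geo-positions to a mapping library, not necessarily what the paper does.

```python
import xml.etree.ElementTree as ET

# Hypothetical geoparser response. The real service's schema is not given
# in the abstract; these element and attribute names are assumptions.
SAMPLE_RESPONSE = """
<geoparse>
  <document id="doc-1" url="http://example.org/travel-narrative">
    <place name="Lisbon" lat="38.7223" lon="-9.1393"/>
    <place name="Porto" lat="41.1579" lon="-8.6291"/>
  </document>
  <document id="doc-2" url="http://example.org/news-article">
    <place name="Lisbon" lat="38.7223" lon="-9.1393"/>
  </document>
</geoparse>
"""


def extract_markers(xml_text):
    """Return one marker per place reference, linked to its source document."""
    root = ET.fromstring(xml_text.strip())
    markers = []
    for doc in root.iter("document"):
        for place in doc.iter("place"):
            markers.append({
                "name": place.get("name"),
                "lat": float(place.get("lat")),
                "lon": float(place.get("lon")),
                "doc_id": doc.get("id"),
                "url": doc.get("url"),
            })
    return markers


def to_geojson(markers):
    """Wrap markers in a GeoJSON FeatureCollection (coordinates are lon, lat)."""
    features = []
    for m in markers:
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [m["lon"], m["lat"]]},
            "properties": {"name": m["name"], "doc_id": m["doc_id"], "url": m["url"]},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    for marker in extract_markers(SAMPLE_RESPONSE):
        print(marker["name"], marker["lat"], marker["lon"], marker["url"])
```

Each marker keeps the URL of the document it came from, so a map front end could turn every point into a link back to the source text.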
Similar Resources
Visualizing Search Results in the Information Retrieval Process
Purpose: One of the most effective ways to achieve optimum information retrieval is through information visualization. Search strategies, probing skills, the querying of information needs, and the analysis of information play a significant role in accessing necessary and useful information. Besides the factors mentioned above, information visualization can increase the availability level of in...
Indexation spatiale et temporelle basée sur un principe de "tuilage" : contribution à la recherche d'information géographique dans des documents textuels faiblement structurés
Most search engines process users' information needs by retrieving documents from pre-built term-based indexes. Such approaches are limited with regard to particular contexts or specific retrieval criteria. Our contribution concerns geographical information retrieval (GIR) and proposes to exploit both spatial and temporal facets to extend classical thematic engines in order to parse unstructured ...
Spyglass: A System for Ontology Based Document Retrieval and Visualization
This paper describes the Spyglass tool, which is designed to help analysts explore very large collections of unstructured text documents. Spyglass uses a domain ontology to index documents, and provides retrieval and visualization services based on the ontology and the resulting index. The ontology based approach allows analysts to share information and helps to ensure consistency of results. T...
Mining Association Rules from Unstructured Documents
This paper presents a system called EART (Extract Association Rules from Text) for discovering association rules from collections of unstructured documents. The EART system processes text only, not images or figures. EART discovers association rules amongst keywords labeling the collection of textual documents. The main characteristic of EART is that the system integrates XML technology (to transf...
Geographically Aware Web Text Mining
Text mining and search have become important research areas over the past few years, mostly due to the large popularity of the Web. A natural extension for these technologies is the development of methods for exploring the geographic context of Web information. Human information needs often present specific geographic constraints. Many Web documents also refer to specific locations. However, re...